Abstract: In this paper, the minimum distance distribution of irregular generalized LDPC (GLDPC) code ensembles is investigated. Two classes of GLDPC code ensembles are analyzed; in one case, the Tanner graph is regular from the variable node perspective, and in the other case the Tanner graph is completely unstructured and irregular. In particular, for the former ensemble class we determine exactly which ensembles have minimum distance growing linearly with the block length with probability approaching unity with increasing block length. This work extends previous results concerning LDPC and regular GLDPC codes to the case where a hybrid mixture of check node types is used.

I. Introduction 

Recently, the design and analysis of coding schemes representing generalizations of Gallager's low-density parity-check (LDPC) codes [1] have gained increasing attention. This interest is motivated above all by the potential capability of these coding schemes to offer a better compromise between waterfall and error floor performance than is currently offered by state-of-the-art LDPC codes.

In the Tanner graph of an LDPC code, any degree-$s$ check node (CN) may be interpreted as a length-$s$ single parity-check (SPC) code, i.e., as an $(s, s-1)$ linear block code. The first proposal of a class of linear block codes generalizing LDPC codes may be found in [2], where it was suggested to replace each CN of a regular LDPC code with a generic linear block code, to enhance the overall minimum distance. The corresponding coding scheme is known as a regular generalized LDPC (GLDPC) code, or Tanner code, and a CN that is not an SPC code is called a generalized CN. More recently, irregular GLDPC codes were considered (see for instance [3]). For such codes, the variable nodes (VNs) may exhibit different degrees and the CN set is composed of a mixture of different linear block codes.

In this paper, we present results on the minimum distance distribution of two classes of GLDPC code ensembles. It is shown that for the considered VN-regular ensembles, the ensembles for which the minimum distance grows linearly with the block length with probability approaching unity (with increasing block length) are precisely those which have good growth rate behavior as defined in [4]. For the unstructured irregular GLDPC ensembles, we provide an upper bound on the probability of the minimum distance lying below a certain fraction of the code's block length.

II. Preliminaries and Notation 

In this work, we will consider two GLDPC code ensembles. 
These ensembles share definitions from the CN perspective, so 
we begin by giving these definitions. 

We define a GLDPC code ensemble as follows. There are $n_c$ different CN types $t \in I_c = \{1, 2, \ldots, n_c\}$. For each CN type $t \in I_c$ we associate a local code denoted by $\mathcal{C}_t$, and we denote by $k_t$, $s_t$ and $r_t$ the dimension, length and minimum distance of $\mathcal{C}_t$, respectively. For $t \in I_c$, $\rho_t$ denotes the fraction of edges connected to CNs of type $t$. The polynomial $\rho(x)$ is defined by $\rho(x) = \sum_{t \in I_c} \rho_t x^{s_t - 1}$.

If $E$ denotes the number of edges in the Tanner graph, the number of CNs of type $t \in I_c$ is then given by $E\rho_t/s_t$. Denoting as usual $\int_0^1 \rho(x)\,dx$ by $\int\rho$, it is easily deduced that the number of CNs is given by $m = E\int\rho$. Therefore, the fraction of CNs of type $t \in I_c$ is given by

$$\gamma_t = \frac{\rho_t}{s_t \int\rho}. \tag{1}$$
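As a quick illustration of (1), the following minimal sketch converts an edge-perspective CN distribution into node-perspective fractions, using $\int\rho = \sum_t \rho_t/s_t$. The function name, the example mixture (half of the edges on length-7 CNs, half on length-15 CNs) and the edge count are hypothetical values chosen only so that all node counts are integers.

```python
from fractions import Fraction

def cn_node_fractions(rho, s):
    """Return (int_rho, gamma) with int_rho = sum_t rho_t / s_t and
    gamma_t = rho_t / (s_t * int_rho), as in (1)."""
    int_rho = sum(Fraction(r) / st for r, st in zip(rho, s))
    gamma = [Fraction(r) / (st * int_rho) for r, st in zip(rho, s)]
    return int_rho, gamma

# Hypothetical example mixture (not taken from the paper).
rho = [Fraction(1, 2), Fraction(1, 2)]   # edge fractions per CN type
s = [7, 15]                              # local code lengths
int_rho, gamma = cn_node_fractions(rho, s)

E = 2100                                 # hypothetical number of edges
m = E * int_rho                          # total number of CNs, m = E * int_rho
cn_counts = [g * m for g in gamma]       # equals E * rho_t / s_t for each type
```

In this example the node-perspective fractions sum to one and the per-type CN counts agree with $E\rho_t/s_t$, as the text states.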



The parity-check matrix for CN type $t \in I_c$ is denoted by $H_t$. The weight enumerating function (WEF) for CN type $t \in I_c$ is given by

$$A^{(t)}(z) = \sum_{u=0}^{s_t} A_u^{(t)} z^u = 1 + \sum_{u=r_t}^{s_t} A_u^{(t)} z^u.$$

Here $A_u^{(t)} \ge 0$ denotes the number of weight-$u$ codewords for CNs of type $t$. We assume that the local codes associated with all CNs have minimum distance of at least 2 (i.e., $r_t \ge 2$ for all $t \in I_c$).

For ensembles which have a positive fraction of CNs with minimum distance 2, the parameter $C$ is defined by

$$C = 2 \sum_{t\,:\,r_t = 2} \frac{\rho_t A_2^{(t)}}{s_t}. \tag{2}$$
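The parameter $C$ can be evaluated directly from the CN description. The sketch below assumes the reading $C = 2\sum_{t: r_t = 2} \rho_t A_2^{(t)}/s_t$ of (2), which is the form consistent with Lemma 1 later in the paper; the function name and the example mixture are hypothetical. For a length-$s$ SPC code, $A_2 = \binom{s}{2}$, so a pure SPC ensemble gives $C = \rho'(1)$, the familiar LDPC quantity.

```python
from fractions import Fraction
from math import comb

def parameter_C(rho, s, r, A2):
    """C = 2 * sum over CN types with minimum distance 2 of rho_t * A2_t / s_t."""
    return 2 * sum(
        Fraction(rho_t) * a2 / s_t
        for rho_t, s_t, r_t, a2 in zip(rho, s, r, A2)
        if r_t == 2
    )

# Hypothetical mixture: length-6 SPC CNs (r = 2) and (7,4) Hamming CNs (r = 3).
rho = [Fraction(1, 3), Fraction(2, 3)]
s = [6, 7]
r = [2, 3]
A2 = [comb(6, 2), 0]      # number of weight-2 codewords of each local code
C = parameter_C(rho, s, r, A2)

# Sanity check: all CNs are length-6 SPC codes, so C should equal rho'(1) = 5.
C_spc = parameter_C([Fraction(1)], [6], [2], [comb(6, 2)])
```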



The number of VNs, which is also equal to the overall block length of the ensemble, is denoted by $N$. The two ensembles differ from the perspectives of VN distribution and Tanner graph interconnectivity. We next provide the further definitions for these two ensembles separately.



Ensemble 2 is VN-irregular (i.e., in general, the VNs do not 
all have the same degree). 



A. Ensemble 1 

Ensemble 1 is an extension of the definition given in [5] and in [6], [7] for regular GLDPC codes to the hybrid CN case. The overall parity-check matrix of the code is formed by vertically concatenating $q \ge 2$ block rows $\mathbf{H}_i$, $i = 1, 2, \ldots, q$ (written in bold to distinguish them from the local parity-check matrices $H_t$). The first block row $\mathbf{H}_1$ is a block-diagonal matrix whose diagonal blocks are the parity-check matrices $H_t$ of the constituent codes associated with the $n_c$ CN types; for each $t \in I_c$, the matrix $H_t$ is repeated $\gamma_t m/q$ times along the diagonal. The resulting matrix $\mathbf{H}_1$, which forms the first of the $q$ block rows of the parity-check matrix for the GLDPC code, is given by

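$$\mathbf{H}_1 = \operatorname{diag}\Bigl(\underbrace{H_1, \ldots, H_1}_{\gamma_1 m/q},\; \ldots,\; \underbrace{H_{n_c}, \ldots, H_{n_c}}_{\gamma_{n_c} m/q}\Bigr),$$

where $H_t$ is the parity-check matrix for CN type $t$. The other $q - 1$ block rows $\mathbf{H}_2, \ldots, \mathbf{H}_q$ are formed by performing random column permutations $\Pi_2, \Pi_3, \ldots, \Pi_q$ on $\mathbf{H}_1$. Stacking the block rows on top of one another results in $\mathbf{H}$, the parity-check matrix of the GLDPC code.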

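The ensemble is defined according to a uniform probability distribution on all permutations $\Pi_\ell$, for every $\ell = 2, 3, \ldots, q$ (together with independence of these permutations). Note that the Tanner graph for this ensemble is VN-regular, i.e., all VNs have the same degree $q$. Since the CNs of type $t$ contribute $(E\rho_t/s_t)(s_t - k_t)$ parity-check equations in total and $N = E/q$, the design rate $R$ for the irregular GLDPC codes in Ensemble 1 is given by

$$R = 1 - q \sum_{t \in I_c} \rho_t \left(1 - \frac{k_t}{s_t}\right). \tag{3}$$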



B. Ensemble 2 

The second ensemble we consider is a generalization of the unstructured irregular LDPC ensemble analyzed in [8], and is also a special case of the unstructured irregular doubly-generalized LDPC ensemble analyzed in [9], [4]. Here $\lambda_d$ denotes the fraction of edges connected to VNs of degree $d$, where $d \in \{2, 3, \ldots, d_v\}$. The polynomial $\lambda(x)$ is defined by $\lambda(x) = \sum_d \lambda_d x^{d-1}$. We denote as usual $\int_0^1 \lambda(x)\,dx$ by $\int\lambda$.

The node-perspective VN degree distribution is defined as

$$\Lambda_d = \frac{\lambda_d}{d \int\lambda}. \tag{4}$$

Here $\Lambda_d$ is the fraction of VNs having degree $d$.
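The conversion in (4) mirrors the CN-side conversion in (1). The following sketch, with a hypothetical degree distribution and edge count, also checks the identity $\Lambda_2 N = \lambda_2 E/2$ that follows from $N = E\int\lambda$ and is used later in the analysis of Ensemble 2.

```python
from fractions import Fraction

def vn_node_fractions(lam):
    """lam maps a VN degree d to the edge fraction lambda_d.
    Returns (int_lambda, Lambda) with Lambda_d = lambda_d / (d * int_lambda)."""
    int_lam = sum(Fraction(l) / d for d, l in lam.items())
    Lam = {d: Fraction(l) / (d * int_lam) for d, l in lam.items()}
    return int_lam, Lam

# Hypothetical example: half of the edges on degree-2 VNs, half on degree-3 VNs.
lam = {2: Fraction(1, 2), 3: Fraction(1, 2)}
int_lam, Lam = vn_node_fractions(lam)

E = 120                 # hypothetical number of edges
N = E * int_lam         # block length, N = E * int_lambda
num_deg2 = Lam[2] * N   # number of degree-2 VNs
```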

The ensemble is defined according to a uniform probability distribution on all permutations connecting the $E$ edges of the Tanner graph. Note that whereas Ensemble 1 is VN-regular, Ensemble 2 is in general VN-irregular.



III. Minimum Distance Results for Ensemble 1 

In [5], [6], a lower bound on the minimum distance of a regular GLDPC code was found (generalizing the corresponding result for LDPC codes in [10]). Following a similar approach yields the following theorem in the case of Ensemble 1.

Theorem 1: Let $d_{\min}$ be the minimum distance of a GLDPC code picked randomly with uniform probability from Ensemble 1 described above. Then

$$\Pr(d_{\min} < \alpha^* N) \to 0 \quad \text{as } N \to \infty, \tag{5}$$

where $\alpha^*$ is the smallest solution in the interval $(0, 1)$ to the two equations

$$F(z, \alpha) = 0 \quad \text{and} \quad \frac{\partial F(z, \alpha)}{\partial z} = 0 \tag{6}$$

for some positive real value $z$, and where

$$F(z, \alpha) = (q - 1) h(\alpha) - q \int\rho \sum_{t \in I_c} \gamma_t \ln[A^{(t)}(z)] + q \alpha \ln(z). \tag{7}$$

Here $h(\alpha) = -\alpha \ln(\alpha) - (1 - \alpha) \ln(1 - \alpha)$ denotes the binary entropy function in nats. Furthermore, for $q > 2$, such an $\alpha^*$ always exists, while for $q = 2$ such an $\alpha^*$ exists if and only if $C < 1$, where $C$ is given by (2).
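The two conditions in (6) can be solved numerically. The sketch below is one possible approach, not the authors' procedure: for each $\alpha$ it maximizes $F(z, \alpha)$ over $z$ by ternary search in $u = \ln z$ (where $F$ is concave, since $\ln A^{(t)}(e^u)$ is convex), then locates the smallest $\alpha$ at which this maximum reaches zero by a grid scan followed by bisection. It uses $q\int\rho\,\gamma_t = q\rho_t/s_t$ from (1), and assumes the maximizing $z$ lies in $(0, 1]$, which holds for $\alpha < 1/2$ when the local codes have no all-zero coordinate. All function names are hypothetical. As a sanity check, $q = 3$ with length-6 SPC codes is Gallager's (3,6) ensemble, whose relative minimum distance is classically about 0.023.

```python
import math

def spc_wef(s):
    """WEF of the length-s single parity-check code."""
    return lambda z: 0.5 * ((1 + z) ** s + (1 - z) ** s)

def hamming_wef(s):
    """WEF of the length-s Hamming code (s = 2^r - 1)."""
    a, b = (s - 1) // 2, (s + 1) // 2
    return lambda z: ((1 + z) ** s + s * (1 + z) ** a * (1 - z) ** b) / (s + 1)

def h(a):
    """Binary entropy function in nats."""
    return -a * math.log(a) - (1 - a) * math.log(1 - a)

def F(z, a, q, rho, s, wefs):
    """F(z, alpha) of (7), using q * int_rho * gamma_t = q * rho_t / s_t."""
    cn = sum(r / st * math.log(A(z)) for r, st, A in zip(rho, s, wefs))
    return (q - 1) * h(a) - q * cn + q * a * math.log(z)

def max_F_over_z(a, q, rho, s, wefs, iters=120):
    """Maximize F over z in (0, 1] by ternary search in u = ln z."""
    lo, hi = -40.0, 0.0
    for _ in range(iters):
        u1, u2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if F(math.exp(u1), a, q, rho, s, wefs) < F(math.exp(u2), a, q, rho, s, wefs):
            lo = u1
        else:
            hi = u2
    return F(math.exp(0.5 * (lo + hi)), a, q, rho, s, wefs)

def alpha_star(q, rho, s, wefs, step=1e-3):
    """Smallest alpha in (0, 1/2) where max_z F(z, alpha) crosses zero, or None."""
    g = lambda a: max_F_over_z(a, q, rho, s, wefs)
    if g(step) <= 0:
        return None            # no positive region near zero (e.g. q = 2, C >= 1)
    a = step
    while a < 0.5 and g(a) > 0:
        a += step
    if a >= 0.5:
        return None
    lo, hi = a - step, a
    for _ in range(40):        # bisection on the sign change
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sanity check: Gallager's (3,6)-regular LDPC ensemble.
a36 = alpha_star(3, [1.0], [6], [spc_wef(6)])

# Hypothetical hybrid ensemble: q = 2, equal edge shares on (15,11) and (31,26) Hamming CNs.
a_mix = alpha_star(2, [0.5, 0.5], [15, 31], [hamming_wef(15), hamming_wef(31)])
```

The same `alpha_star` call accepts any mixture of CN types, so sweeping the edge fractions would trace out curves of the kind described in Example 1.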

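Proof: Let $P_1(d)$ denote the probability that a length-$N$ vector $\mathbf{c}$ which satisfies $\mathbf{c}\mathbf{H}_1^T = \mathbf{0}$ (i.e., which satisfies the parity checks in the first block row $\mathbf{H}_1$) has Hamming weight $d$. The generating function for this sequence is

$$\sum_{d=0}^{N} P_1(d) z^d = \prod_{t \in I_c} \left[\psi^{(t)}(z)\right]^{\gamma_t m/q},$$

where $\psi^{(t)}(z) = A^{(t)}(z)/2^{k_t}$ is the moment generating function of the Hamming weight of a codeword in $\mathcal{C}_t$. Since $P_1(d) \ge 0$ for all $d$, we can write (for any $z > 0$)

$$P_1(d) \le \exp\left(\frac{m}{q} \sum_{t \in I_c} \gamma_t \ln[\psi^{(t)}(z)] - d \ln z\right). \tag{8}$$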

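Next, let $F_1(d)$ denote the probability that a length-$N$ vector of Hamming weight $d$ satisfies the parity checks in the first block row $\mathbf{H}_1$ of $\mathbf{H}$. Since the number of weight-$d$ vectors satisfying these checks is $P_1(d) \prod_{t \in I_c} 2^{k_t \gamma_t m/q}$, an upper bound on this probability is readily deduced from (8) as

$$F_1(d) \le \binom{N}{d}^{-1} \exp\left(\frac{m}{q} \sum_{t \in I_c} \gamma_t \ln[A^{(t)}(z)] - d \ln z\right).$$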

Any vector which satisfies all $q$ of the block rows of $\mathbf{H}$ is a valid codeword for the GLDPC code, and satisfaction of the different block rows are independent events. Thus, an upper bound on $F(d)$, the probability that a length-$N$ vector of weight $d$ is a codeword of the GLDPC code, is given by

$$F(d) = [F_1(d)]^q \le \binom{N}{d}^{-q} \exp\left(m \sum_{t \in I_c} \gamma_t \ln[A^{(t)}(z)] - q d \ln z\right).$$



The expected number of codewords of weight $d$, for a code chosen uniformly at random from Ensemble 1, is thus

$$M(d) = \binom{N}{d} F(d) \le \binom{N}{d}^{-(q-1)} \exp\left(m \sum_{t \in I_c} \gamma_t \ln[A^{(t)}(z)] - q d \ln z\right).$$



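We use this to bound the probability of the event $d_{\min} < d_0$. Using Markov's inequality,

$$\Pr(d_{\min} < d_0) \le \sum_{d=1}^{d_0} M(d) \le d_0 \max_{1 \le d \le d_0} \binom{N}{d}^{-(q-1)} \exp\left(m \sum_{t \in I_c} \gamma_t \ln[A^{(t)}(z)] - q d \ln z\right).$$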
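Using the relation [10]

$$\binom{N}{d} \ge \sqrt{\frac{N}{8 d (N - d)}}\, \exp\left(N h\left(\frac{d}{N}\right)\right)$$

and letting $\alpha = d/N$, leads to

$$\Pr(d_{\min} < d_0) \le \max_{1 \le d \le d_0} \exp[-N F(z, \alpha) + o(N)], \tag{9}$$

where $F(z, \alpha)$ is given in (7), and we have used the fact that the total number of edges in the Tanner graph is $E = m/\int\rho = Nq$. Here

$$o(N) = \ln\left[d_0 \left(8 N \alpha (1 - \alpha)\right)^{(q-1)/2}\right].$$

Note that $o(N)/N \to 0$ as $N \to \infty$.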
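The binomial lower bound used in deriving (9), and the claim that the correction term is sublinear in $N$, are easy to check numerically. The following sketch works in the log domain to avoid overflow; the helper names and the sample parameters ($\alpha = 0.1$, $d_0 = 0.1N$, $q = 3$) are hypothetical choices for illustration.

```python
import math

def h(a):
    """Binary entropy function in nats."""
    return -a * math.log(a) - (1 - a) * math.log(1 - a)

def log_binom(N, d):
    """Natural log of the binomial coefficient C(N, d)."""
    return math.lgamma(N + 1) - math.lgamma(d + 1) - math.lgamma(N - d + 1)

def log_lower_bound(N, d):
    """Natural log of sqrt(N / (8 d (N - d))) * exp(N h(d / N))."""
    return 0.5 * math.log(N / (8 * d * (N - d))) + N * h(d / N)

def o_term(N, alpha, d0, q):
    """The o(N) term: ln[d0 * (8 N alpha (1 - alpha))^((q - 1) / 2)]."""
    return math.log(d0 * (8 * N * alpha * (1 - alpha)) ** ((q - 1) / 2))

# Smallest value of ln C(N, d) minus the log of the bound over a range of N and d.
worst_gap = min(
    log_binom(N, d) - log_lower_bound(N, d)
    for N in (10, 100, 1000)
    for d in range(1, N)
)

# o(N) / N for growing N with alpha = 0.1, d0 = 0.1 N, q = 3.
ratios = [o_term(N, 0.1, 0.1 * N, 3) / N for N in (10**3, 10**5, 10**7)]
```

A non-negative `worst_gap` confirms the bound on the tested range, and the decreasing `ratios` illustrate that the correction term does not affect the exponent in (9).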

To prove that $\Pr(d_{\min} < \alpha_0 N) \to 0$ as $N \to \infty$ for a given $\alpha_0$, it suffices to find $z > 0$ such that $F(z, \alpha) > 0$ for all $0 < \alpha < \alpha_0$; in this case, the bound in (9) guarantees a decrease of $\Pr(d_{\min} < \alpha_0 N)$ to zero with increasing $N$. The best bound is obtained when $F(z, \alpha_0) = 0$, as otherwise we could increase $\alpha_0$ and obtain a similar bound for a larger value than $\alpha_0$ (using the same $z$). Also, the best bound is attained when $F(z, \alpha_0)$ is maximized with respect to $z$, since otherwise $F(z, \alpha_0)$ could be increased slightly (made positive), and the bound obtained for a larger $\alpha_0$ as explained in the previous point.

We next note that the value of $\alpha^*$ occurring in the solution of (6) exactly matches the value of the critical exponent codeword weight ratio which occurs in [4, Theorem 4.2] in the context of the GLDPC ensemble considered therein. To see this, note that the constraint $\partial F(z, \alpha)/\partial z = 0$ leads to $f(z) = \alpha$, where $f(\cdot)$ is defined in [4, Definition 4.1], and then the constraint $F(z, \alpha) = 0$ leads to $G(\alpha^*) = 0$, where $G(\alpha)$ denotes the growth rate of the weight distribution as defined at the beginning of [4, Section III], and where the relation $f(z) = \alpha$ is implicit. From the analysis of this function $G(\alpha)$ conducted in [9] we know that a solution to $G(\alpha^*) = 0$ with $\alpha^* \in (0, 1)$ exists if and only if either $q > 2$, or $q = 2$ and $C < 1$ (this is the necessary and sufficient condition for $G'(0) = -\infty$ and $G'(0) < 0$, respectively, and it is easy to show that $G(\alpha)$ must be positive for some $\alpha > 0$). ■

Fig. 1. Ratios of minimum distance to block length for irregular GLDPC codes, plotted against the design rate $R$ of the ensemble.

It is interesting to note that, while the ensemble considered 
in this latter result is also VN-regular, and the ensemble 
definitions match from the CN side, they are in fact slightly 
different ensembles. 

Example 1: In this example, the ensemble relative minimum distance $\alpha^*$ is evaluated using Theorem 1 for several irregular GLDPC code ensembles of the Ensemble 1 type, and plotted against the design rate of the ensemble (as given by (3)); these results are shown in Figure 1. Here we use Hamming (63,57), (31,26) and (15,11) codes as the local codes at the CNs. Note that for Hamming codes of length $s_t$, we have

$$A^{(t)}(z) = \frac{1}{s_t + 1}\left[(1 + z)^{s_t} + s_t (1 + z)^{(s_t - 1)/2} (1 - z)^{(s_t + 1)/2}\right].$$
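The Hamming WEF $\frac{1}{s_t+1}\bigl[(1+z)^{s_t} + s_t(1+z)^{(s_t-1)/2}(1-z)^{(s_t+1)/2}\bigr]$ can be expanded with exact integer arithmetic to recover the weight distribution $A_u^{(t)}$. The sketch below is a small self-check with hypothetical helper names; for the (7,4) code it reproduces the well-known distribution $1 + 7z^3 + 7z^4 + z^7$.

```python
from math import comb

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def binomial_poly(sign, n):
    """Coefficients of (1 + sign * z)^n."""
    return [comb(n, k) * sign ** k for k in range(n + 1)]

def hamming_wef_coeffs(s):
    """Weight distribution [A_0, ..., A_s] of the length-s Hamming code."""
    first = binomial_poly(1, s)
    second = poly_mul(binomial_poly(1, (s - 1) // 2), binomial_poly(-1, (s + 1) // 2))
    total = [a + s * b for a, b in zip(first, second)]
    return [c // (s + 1) for c in total]   # every entry is divisible by s + 1

A7 = hamming_wef_coeffs(7)
A15 = hamming_wef_coeffs(15)
A63 = hamming_wef_coeffs(63)
```

The coefficient sums equal $2^{k_t}$ and the first nonzero weight is 3, consistent with $r_t = 3$ for Hamming codes.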

In each case, two of the three code types are used ($I_c = \{1, 2\}$), with $\gamma_1$ being varied between 0 and 1; this results in the blue curves joining pairs of points corresponding to regular GLDPC code ensembles. Note that while mixing CN types tends to bring us further from the Gilbert-Varshamov bound, the threshold of the ensemble will be optimized for some mixture of CN types.



IV. Minimum Distance Results for Ensemble 2

In this section, we analyze the minimum distance distribu- 
tion of Ensemble 2. We will assume throughout this section 
that the GLDPC ensemble under consideration has at least one 
CN type with minimum distance equal to 2. The results of this 
section rely on the following basic Lemma. 



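Lemma 1: For any positive integer $j$,

$$\lim_{N \to \infty} \mathrm{Coef}\left[\prod_{t \in I_c} \left[A^{(t)}(x)\right]^{\gamma_t m}, x^{2j}\right] = \frac{(EC/2)^j}{j!}, \tag{10}$$

where $\mathrm{Coef}[f(x), x^i]$ denotes the coefficient of $x^i$ in the Taylor expansion of $f(x)$ (as in [11]), and the parameter $C$ is given by (2). Since $E$ grows linearly with $N$, (10) is to be read asymptotically: the ratio of the two sides tends to 1 as $N \to \infty$.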

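Proof: The proof is notationally cumbersome, so we present it for the example where $n_c = 2$, with $A^{(1)}(x) = 1 + A_2 x^2 + A_3 x^3$ and $A^{(2)}(x) = 1 + B_2 x^2 + B_4 x^4$. Then

$$P(x) = (1 + A_2 x^2 + A_3 x^3)^{\gamma_1 m} (1 + B_2 x^2 + B_4 x^4)^{\gamma_2 m}.$$

The multinomial theorem then gives

$$P(x) = \sum_{i_2, i_3, j_2, j_4} \binom{\gamma_1 m}{i_2 \;\, i_3} \binom{\gamma_2 m}{j_2 \;\, j_4} A_2^{i_2} A_3^{i_3} B_2^{j_2} B_4^{j_4}\, x^{2(i_2 + j_2) + 3 i_3 + 4 j_4},$$

where $\binom{n}{a \;\, b} = n!/(a!\, b!\, (n - a - b)!)$. The coefficient of $x^{2j}$ in $P(x)$ is then given by

$$\mathrm{Coef}[P(x), x^{2j}] = \sum_{2(i_2 + j_2) + 3 i_3 + 4 j_4 = 2j} \binom{\gamma_1 m}{i_2 \;\, i_3} \binom{\gamma_2 m}{j_2 \;\, j_4} A_2^{i_2} A_3^{i_3} B_2^{j_2} B_4^{j_4}. \tag{11}$$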
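Consider first the sum of all terms where $i_3 = j_4 = 0$. This is

$$T_1 = \sum_{i_2 + j_2 = j} \binom{\gamma_1 m}{i_2} \binom{\gamma_2 m}{j_2} A_2^{i_2} B_2^{j_2} = \sum_{i_2 = 0}^{j} \binom{\gamma_1 m}{i_2} \binom{\gamma_2 m}{j - i_2} A_2^{i_2} B_2^{j - i_2}.$$

As $N \to \infty$ we have

$$T_1 \to \sum_{i_2 = 0}^{j} \frac{(\gamma_1 m)^{i_2}}{i_2!} \frac{(\gamma_2 m)^{j - i_2}}{(j - i_2)!} A_2^{i_2} B_2^{j - i_2} = \frac{1}{j!} \sum_{i_2 = 0}^{j} \binom{j}{i_2} (\gamma_1 m A_2)^{i_2} (\gamma_2 m B_2)^{j - i_2} = \frac{1}{j!} (\gamma_1 m A_2 + \gamma_2 m B_2)^j. \tag{12}$$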



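Note that this term is $\Theta(m^j)$ as $N \to \infty$. In general, the $(i_2, j_2, i_3, j_4)$ term is

$$\Theta\left((\gamma_1 m)^{i_2 + i_3} (\gamma_2 m)^{j_2 + j_4}\right) = \Theta\left(m^{i_2 + i_3 + j_2 + j_4}\right) = \Theta(m^k),$$

where the exponent $k$ satisfies

$$2k = 2(i_2 + i_3 + j_2 + j_4) \le 2(i_2 + j_2) + 3 i_3 + 4 j_4 = 2j,$$

and we conclude that $k \le j$, with strict inequality unless $i_3 = j_4 = 0$.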

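Since $m^j$ terms dominate all terms $m^k$ with $k < j$, the limiting expression for (11) as $N \to \infty$ involves only those product terms in $P(x)$ for which $r_t = 2$. Therefore, in general we obtain

$$\lim_{N \to \infty} \mathrm{Coef}\left[\prod_{t \in I_c} \left[A^{(t)}(x)\right]^{\gamma_t m}, x^{2j}\right] = \frac{1}{j!} \Bigl(\sum_{t\,:\,r_t = 2} \gamma_t m A_2^{(t)}\Bigr)^j = \frac{1}{j!} \Bigl(m \sum_{t\,:\,r_t = 2} \frac{\rho_t A_2^{(t)}}{s_t \int\rho}\Bigr)^j = \frac{1}{j!} \left(\frac{EC}{2}\right)^j,$$

where the last two steps use (1), (2) and $m = E\int\rho$. ■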
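The asymptotic statement of Lemma 1 can be illustrated numerically for the two-type example used in the proof. The sketch below computes the exact coefficient of $x^{2j}$ in $(1 + A_2x^2 + A_3x^3)^{n_1}(1 + B_2x^2 + B_4x^4)^{n_2}$ by truncated polynomial powering and divides it by $(n_1A_2 + n_2B_2)^j/j!$, where $n_1 = \gamma_1 m$ and $n_2 = \gamma_2 m$. The coefficient values and node counts are hypothetical; they are not weight distributions of specific codes.

```python
from math import factorial

def poly_mul_trunc(p, q, deg):
    """Product of two coefficient lists, truncated to degree deg."""
    out = [0] * (deg + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > deg:
                break
            out[i + j] += a * b
    return out

def poly_pow_trunc(p, n, deg):
    """p(x)^n truncated to degree deg, by repeated squaring."""
    result = [1] + [0] * deg
    base = (list(p) + [0] * (deg + 1))[: deg + 1]
    while n:
        if n & 1:
            result = poly_mul_trunc(result, base, deg)
        base = poly_mul_trunc(base, base, deg)
        n >>= 1
    return result

def coef_ratio(n1, n2, A, B, j):
    """Coef[A(x)^n1 * B(x)^n2, x^(2j)] divided by (n1*A_2 + n2*B_2)^j / j!."""
    deg = 2 * j
    P = poly_mul_trunc(poly_pow_trunc(A, n1, deg), poly_pow_trunc(B, n2, deg), deg)
    return P[deg] * factorial(j) / (n1 * A[2] + n2 * B[2]) ** j

A = [1, 0, 3, 4]        # hypothetical: 1 + 3x^2 + 4x^3
B = [1, 0, 1, 0, 2]     # hypothetical: 1 + x^2 + 2x^4
r_small = coef_ratio(50, 50, A, B, 2)
r_large = coef_ratio(5000, 5000, A, B, 2)
r_j3 = coef_ratio(5000, 5000, A, B, 3)
```

The ratio approaches 1 as the number of CNs grows, with the terms involving $A_3$ and $B_4$ contributing only lower-order corrections, as argued in the proof.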



Next we use this result to generalize [11, Lemma 9] as follows.

Theorem 2: For GLDPC code Ensemble 2, we have

$$\lim_{N \to \infty} \Pr(d_{\min} = 1) = 1 - \exp\left(-\frac{\lambda'(0) C}{2}\right).$$



Proof: We restrict ourselves to considering only degree-2 variable nodes (this may be justified in a manner similar to that described in the proof of [11, Lemma 9]). Recall that there are $\Lambda_2 N$ of these VNs.

From Lemma 1 in the special case $j = 1$, we have

$$\lim_{N \to \infty} \mathrm{Coef}\left[\prod_{t \in I_c} \left[A^{(t)}(x)\right]^{\gamma_t m}, x^2\right] = \frac{EC}{2}.$$



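Let $A_i$ denote the event that VN $\{v_i\}$ is a codeword (of Hamming weight 1) of the GLDPC code:

$$A_i = \{\{v_i\} \text{ is a codeword}\} = \{\{v_i\} \in \mathcal{C}\}.$$

Then $\Pr(d_{\min} = 1)$ may be written as the probability of a union of such events, which may then be expanded using the inclusion-exclusion principle:

$$\Pr(d_{\min} = 1) = \Pr\Bigl(\bigcup_i A_i\Bigr) = \sum_{j=1}^{\Lambda_2 N} (-1)^{j+1} \sum_{1 \le i_1 < \cdots < i_j \le \Lambda_2 N} \Pr(A_{i_1} \cap \cdots \cap A_{i_j}). \tag{13}$$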



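The general term in this alternating sum is the sum, evaluated over all sets $V_2 = \{v_{i_1}, v_{i_2}, \ldots, v_{i_j}\}$ of $j$ degree-2 VNs, of the probability that all VNs in the set $V_2$ individually form codewords (of Hamming weight 1), i.e.,

$$\sum_{1 \le i_1 < i_2 < \cdots < i_j \le \Lambda_2 N} \Pr(\{v_{i_1}\}, \{v_{i_2}\}, \ldots, \{v_{i_j}\} \in \mathcal{C}) = \binom{\Lambda_2 N}{j} \frac{\left\{\mathrm{Coef}\left[\prod_{t \in I_c} \left[A^{(t)}(x)\right]^{\gamma_t m}, x^2\right]\right\}^j}{\binom{E}{2 \;\, 2 \;\, \cdots \;\, 2}}.$$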



In the fraction above, the denominator is the number of ways of choosing $j$ pairs of edges in the Tanner graph, while the numerator counts the number of these such that each pair individually satisfies the CN constraints (i.e., when 1s are placed on this pair of edges, and 0s on the other edges). In the limit as $N \to \infty$ we obtain (invoking Lemma 1)

$$\lim_{N \to \infty} \sum_{1 \le i_1 < i_2 < \cdots < i_j \le \Lambda_2 N} \Pr(\{v_{i_1}\}, \{v_{i_2}\}, \ldots, \{v_{i_j}\} \in \mathcal{C}) = \frac{1}{j!} \left(\frac{\lambda'(0) C}{2}\right)^j, \tag{14}$$



where we have made use of the fact that

$$\Lambda_2 N = \frac{\lambda_2 E}{2} = \frac{\lambda'(0) E}{2}.$$

Substituting this result into (13) and using the Taylor series of the exponential function yields the result of the theorem. ■
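The final step of this proof sums the alternating series $\sum_{j \ge 1} (-1)^{j+1} x^j/j!$ with $x = \lambda'(0)C/2$, which equals $1 - e^{-x}$. The sketch below checks this numerically and evaluates the limiting probability for a hypothetical ensemble with $\lambda'(0) = \lambda_2 = 0.3$ and $C = 2$; both values and the function names are assumptions made for illustration.

```python
import math

def pr_dmin_1_limit(lam2, C):
    """Limiting value 1 - exp(-lambda'(0) * C / 2) of Pr(d_min = 1)."""
    return 1 - math.exp(-lam2 * C / 2)

def inclusion_exclusion_sum(lam2, C, terms):
    """Partial sum of the alternating series obtained from (13) and (14)."""
    x = lam2 * C / 2
    return sum((-1) ** (j + 1) * x ** j / math.factorial(j) for j in range(1, terms + 1))

limit = pr_dmin_1_limit(0.3, 2.0)
partial = inclusion_exclusion_sum(0.3, 2.0, 20)
```

The limit is zero exactly when $\lambda'(0)C = 0$, i.e., when the ensemble has no degree-2 VNs or no CNs of minimum distance 2.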

Next we prove an upper bound on the probability that the minimum distance of a GLDPC code is smaller than $\alpha^* N$, which generalizes [11, Lemma 22].

Theorem 3: For GLDPC code Ensemble 2,

$$\Pr(d_{\min} < \alpha^* N) \le \frac{1}{\sqrt{1 - \lambda'(0) C}}.$$

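Proof: We denote by $V$ and $\mathcal{C}$ the set of length-$N$ binary vectors and the set of codewords, respectively. Consider events $\{S \in \mathcal{C}\}$, where $S \in V$ and $|S|$ denotes the Hamming weight of $S$. The probability that $d_{\min} < \alpha^* N$ for a code equals the probability that at least one nonzero $S$ with $|S| < \alpha^* N$ is a member of $\mathcal{C}$, i.e., the probability of the union of the events $\{S \in \mathcal{C}\}$ where $|S| < \alpha^* N$:

$$\Pr(d_{\min} < \alpha^* N) = \Pr\Bigl(\bigcup_{S,\, |S| < \alpha^* N} \{S \in \mathcal{C}\}\Bigr).$$

Using the union bound,

$$\Pr\Bigl(\bigcup_{S,\, |S| < \alpha^* N} \{S \in \mathcal{C}\}\Bigr) \le \sum_{S,\, |S| < \alpha^* N} \Pr(S \in \mathcal{C}) = \sum_{j=1}^{\alpha^* N} \sum_{S,\, |S| = j} \Pr(S \in \mathcal{C}). \tag{15}$$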



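Since all of the summands in the innermost summation in (15) are equal, we have

$$\sum_{S,\, |S| = j} \Pr(S \in \mathcal{C}) = \binom{\Lambda_2 N}{j} \frac{\mathrm{Coef}\left[\prod_{t \in I_c} \left[A^{(t)}(x)\right]^{\gamma_t m}, x^{2j}\right]}{\binom{E}{2j}}.$$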



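In the fraction above, the denominator is the number of ways of choosing $2j$ edges in the Tanner graph, while the numerator counts the number of these such that placing 1s on these edges and 0s on the other edges satisfies all of the CN constraints. Taking the limit as $N \to \infty$ and invoking Lemma 1 we obtain

$$\lim_{N \to \infty} \sum_{S,\, |S| = j} \Pr(S \in \mathcal{C}) = \frac{1}{j!} \left(\frac{\lambda'(0) E}{2}\right)^j \cdot \frac{1}{j!} \left(\frac{EC}{2}\right)^j \cdot \frac{(2j)!}{E^{2j}} = \binom{2j}{j} \left(\frac{\lambda'(0) C}{4}\right)^j,$$

where we have again used the identity $\Lambda_2 N = \lambda'(0) E/2$ stated after (14). Therefore,

$$\Pr(d_{\min} < \alpha^* N) \le \sum_{j=1}^{\alpha^* N} \binom{2j}{j} \left(\frac{\lambda'(0) C}{4}\right)^j. \tag{16}$$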

The generating function for the central binomial coefficient is given by [12]

$$\sum_{j=0}^{\infty} \binom{2j}{j} x^j = \frac{1}{\sqrt{1 - 4x}}. \tag{17}$$



Finally, bounding the sum in (16) by the full series (17) evaluated at $x = \lambda'(0) C/4$ (which converges when $\lambda'(0) C < 1$), we arrive at the statement of the theorem. ■
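The last step relies on the closed form $\sum_{j \ge 0} \binom{2j}{j} x^j = 1/\sqrt{1 - 4x}$ with $x = \lambda'(0)C/4$. The sketch below evaluates the union-bound sum from $j = 1$ numerically for a hypothetical value $\lambda'(0)C = 0.5$ and compares it with the closed form; since the $j = 0$ term of the series equals 1, the sum from $j = 1$ equals the closed form minus 1. The function names are assumptions.

```python
import math

def union_bound_sum(lam2_C, terms):
    """Sum over j = 1..terms of C(2j, j) * (lambda'(0) C / 4)^j."""
    x = lam2_C / 4
    return sum(math.comb(2 * j, j) * x ** j for j in range(1, terms + 1))

def closed_form(lam2_C):
    """Closed form 1 / sqrt(1 - lambda'(0) C), valid for lambda'(0) C < 1."""
    return 1 / math.sqrt(1 - lam2_C)

value = union_bound_sum(0.5, 200)   # hypothetical lambda'(0) * C = 0.5
bound = closed_form(0.5)
```

The series converges only for $\lambda'(0)C < 1$, so the bound is informative only in that regime.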